A Multimodal Listener Behaviour Driven by Audio Input
Authors
Abstract
Our aim is to build a platform allowing a user to chat with a virtual agent. The agent displays audio-visual backchannels in response to the user's verbal and nonverbal behaviours. Our system takes as input the audio-visual signals of the user and synchronously outputs the audio-visual behaviours of the agent. In this paper, we describe the SEMAINE architecture and the data flow that goes from inputs (audio and video) to outputs (voice synthesizer and virtual characters), passing through analysers and interpreters. We focus, more particularly, on the multimodal behaviour of the listener model driven by audio input.
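The paper's central idea, a listener model whose backchannels are triggered by the speaker's audio, can be illustrated with a minimal rule-based sketch. This is an assumption-laden toy, not the SEMAINE implementation: the function name, the energy threshold, and the pause length are all hypothetical, and real analysers would also use prosodic cues rather than raw frame energy alone.

```python
def backchannel_cues(frame_energies, energy_threshold=0.1, pause_frames=5):
    """Scan per-frame audio energies and return the frame indices at which
    the listener agent could emit a backchannel (e.g. a nod or 'mm-hm').

    A cue fires when a run of `pause_frames` consecutive low-energy frames
    follows at least one high-energy (speech) frame.
    """
    cues = []
    heard_speech = False  # have we seen speech since the last cue?
    silent_run = 0        # length of the current low-energy run
    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            heard_speech = True
            silent_run = 0
        else:
            silent_run += 1
            if heard_speech and silent_run == pause_frames:
                cues.append(i)        # pause after speech -> backchannel
                heard_speech = False  # wait for new speech before next cue
    return cues

# Three speech frames followed by silence: the cue fires once the
# pause has lasted `pause_frames` frames.
print(backchannel_cues([0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]))  # [7]
```

In an architecture like the one described, such cue indices would be handed to a behaviour planner that selects and synchronizes the audio-visual realization (head nod, vocalization) on the virtual character.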
Similar resources
CONTROL OF CHAOS IN A DRIVEN NON LINEAR DYNAMICAL SYSTEM
We present a numerical study of a one-dimensional version of the Burridge-Knopoff model [16] of an N-site chain of spring-blocks with stick-slip dynamics. Our numerical analysis and computer simulations lead to a set of different results corresponding to different boundary conditions. It is shown that we can convert a chaotically behaving system to highly ordered and periodic behaviour by making on...
Multimodal Feedback from Robots and Agents in a Storytelling Experiment
In this project, which lies at the intersection between Human-Robot Interaction (HRI) and Human-Computer Interaction (HCI), we have examined the design of an open-source, real-time software platform for controlling the feedback provided by an AIBO robot and/or by the GRETA Embodied Conversational Agent, when listening to a story told by a human narrator. Based on ground truth data obtained from...
Incremental Multimodal Feedback for Conversational Agents
Just like humans, conversational computer systems should not listen silently to their input and then respond. Instead, they should enforce the speaker-listener link by attending actively and giving feedback on an utterance while perceiving it. Most existing systems produce direct feedback responses to decisive (e.g. prosodic) cues. We present a framework that conceives of feedback as a more com...
Learning Spoken Words from Multisensory Input
Speech recognition and speech translation are traditionally addressed by processing acoustic signals while nonlinguistic information is typically not used. In this paper, we present a new method which explores the spoken word learning from naturally co-occurring multisensory information in a dyadic(two-person) conversation. It has been noticed that the listener always has a strong tendency to l...
TOWARDS MULTIMODAL CONTENT REPRESENTATION Discussion paper
Multimodal interfaces, combining the use of speech, graphics, gestures, and facial expressions in input and output, promise to provide new possibilities to deal with information in more effective and efficient ways, supporting for instance: the understanding of possibly imprecise, partial or ambiguous multimodal input; the generation of coordinated, cohesive, and coherent multimodal presentatio...
Journal:
Volume, Issue:
Pages: -
Publication date: 2010